MARBLES: Mining Association Rules Buried in Long Event Sequences

Authors

  • Boris Cule
  • Nikolaj Tatti
  • Bart Goethals
Abstract

Sequential pattern discovery is a well-studied field in data mining. Episodes are sequential patterns that describe events that often occur in the vicinity of each other. Episodes can impose restrictions on the order of the events, which makes them a versatile technique for describing complex patterns in the sequence. Most of the research on episodes deals with special cases such as serial and parallel episodes, while discovering general episodes is surprisingly understudied. This is particularly true when it comes to discovering association rules between them. In this paper we propose an algorithm that mines association rules between two general episodes. On top of the traditional definitions of frequency and confidence, we introduce two novel confidence measures for the rules. The major challenge in mining these association rules is pattern explosion. To limit the output, we aim to eliminate all redundant rules. We define the class of closed association rules, and show that this class contains all non-redundant output. To make the algorithm efficient, we use further pruning steps along the way. First, we generate only free and closed frequent episodes, from which we create candidate rules. Next, we speed up the evaluation of the rules. Finally, we prune the remaining non-closed rules from the output.

Similar resources

Mining Association Rules in Long Sequences

Discovering interesting patterns in long sequences, and finding confident association rules within them, is a popular area in data mining. Most existing methods define patterns as interesting if they occur frequently enough in a sufficiently cohesive form. Based on these frequent patterns, association rules are mined in the traditional manner. Recently, a new interestingness measure, combining ...

Discovering Representative Episodal Association Rules from Event Sequences Using Frequent Closed Episode Sets and Event Constraints

Discovering association rules from time-series data is an important data mining problem. The number of potential rules grows quickly as the number of items in the antecedent grows. It is therefore difficult for an expert to analyze the rules and identify the useful ones. An approach for generating representative association rules for transactions that uses only a subset of the set of frequent itemse...

PROWL: An Efficient Frequent continuity Mining Algorithm on Event Sequences

Mining association rules in event sequences is an important data mining problem with many applications. Most previous studies on association rules focus on mining intra-transaction associations, which consider only relationships among the items in the same transaction. However, intra-transaction association rules are not suitable for trend prediction. Therefore, inter-transaction association is ...

A Novel Boolean Algebraic Framework for Association and Pattern Mining

Data mining has been defined as the nontrivial extraction of implicit, previously unknown and potentially useful information from data. Association mining and sequential mining analysis are considered as crucial components of strategic control over a broad variety of disciplines in business, science and engineering. Association mining is one of the important sub-fields in data mining, where rul...

Introducing an algorithm for use to hide sensitive association rules through perturb technique

Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...

Publication date: 2012